Overview

Dataset Statistics

Number of Variables 18
Number of Rows 6966
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 3.9 MB
Average Row Size in Memory 587.5 B
Variable Types
  • Categorical: 10
  • Numerical: 8
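
The Average Row Size in Memory figure can be sanity-checked from the total size and the row count. The sketch below assumes the average is simply total bytes divided by rows, and tries both common meanings of "MB" because the report does not say which one it uses. The constant names are illustrative only.

```python
ROWS = 6966      # Number of Rows
TOTAL_MB = 3.9   # Total Size in Memory, already rounded in the report

# Average bytes per row under the two usual definitions of a megabyte
avg_binary = TOTAL_MB * 1024 ** 2 / ROWS   # 1 MB = 2**20 bytes
avg_decimal = TOTAL_MB * 1000 ** 2 / ROWS  # 1 MB = 10**6 bytes

print(f"binary:  {avg_binary:.1f} B per row")
print(f"decimal: {avg_decimal:.1f} B per row")
```

The binary interpretation lands within about one byte of the reported 587.5 B, so the sizes in this report are most likely expressed in units of 1024.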

Dataset Insights

rain is skewed
count is skewed
date has high cardinality: 364 distinct values
date has constant length 10
year has constant length 4
dayofweek_n has constant length 1
season has constant length 6
rain has 6325 (90.8%) zeros

Variables


date

categorical

Approximate Distinct Count 364
Approximate Unique (%) 5.2%
Missing 0
Missing (%) 0.0%
Memory Size 510.2 KB
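
Approximate Unique (%) appears to be the distinct count divided by the number of rows. A minimal sketch, assuming that definition (the helper name unique_pct is hypothetical), reproduces the figures reported for several variables:

```python
ROWS = 6966  # Number of Rows from the dataset statistics


def unique_pct(distinct: int, rows: int = ROWS) -> float:
    """Distinct values as a percentage of all rows, rounded to one decimal."""
    return round(100 * distinct / rows, 1)


# Distinct counts copied from the report
for name, distinct in [("date", 364), ("hour", 24), ("temp", 284), ("year", 2)]:
    print(f"{name}: {unique_pct(distinct)}%")
```

For date, 364 distinct values over 6966 rows also means roughly 19 rows per date, which fits hourly records.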

Length

Mean 10
Standard Deviation 0
Median 10
Minimum 10
Maximum 10

Sample

1st row 2021-03-01
2nd row 2021-03-01
3rd row 2021-03-01
4th row 2021-03-01
5th row 2021-03-01

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 13932
Decimal Number 55728
  • date has words of constant length
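
The labels in the Letter block match the names of Unicode general categories (Dash Punctuation is Pd, Decimal Number is Nd, and so on). The sketch below assumes the counts are per-character totals over the whole column. The function name and the stand-in column are illustrative, not taken from the tool that produced the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Stand-in column: 6966 ISO dates. The real column has 364 distinct dates,
# but every value has the same YYYY-MM-DD shape, so the totals are the same.
dates = ["2021-03-01"] * 6966
counts = category_counts(dates)

print(counts["Pd"])  # Dash Punctuation: 2 per value
print(counts["Nd"])  # Decimal Number: 8 per value
```

Two dashes and eight digits per value over 6966 rows give the 13932 and 55728 shown above.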

hour

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 12.8293
Minimum 0
Maximum 23
Zeros 266
Zeros (%) 3.8%
Negatives 0
Negatives (%) 0.0%
  • hour is skewed left (γ1 = -0.2775)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 8
Median 13
Q3 18
95-th Percentile 22
Maximum 23
Range 23
IQR 10

Descriptive Statistics

Mean 12.8293
Standard Deviation 6.3473
Variance 40.288
Sum 89369
Skewness -0.2775
Kurtosis -0.836
Coefficient of Variation 0.4947
  • hour is not normally distributed (p-value 5.424573195730559e-26)
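
The descriptive statistics are tied together by a few identities: variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A kurtosis below 1 (here -0.836) can only be excess kurtosis, that is, kurtosis minus 3. The sketch below assumes population (divide-by-n) moments; the report does not say whether it uses sample or population formulas, and at 6966 rows the difference is negligible. The input is a hypothetical column with each hour 0 to 23 appearing once, not the real data.

```python
import math


def describe(values):
    """Moment-based summary using population formulas and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "sd": sd,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,   # the gamma-1 shown in the insights
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis
        "cv": sd / mean,
    }


# Hypothetical input: a perfectly uniform day, each hour exactly once
stats = describe(list(range(24)))
for key, value in stats.items():
    print(f"{key}: {value:.4f}")
```

A uniform spread of hours has zero skewness and an excess kurtosis near -1.2. The reported -0.2775 and -0.836 therefore indicate that the records lean toward later hours and are more concentrated than a uniform spread would be.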

rain

numerical

Approximate Distinct Count 44
Approximate Unique (%) 0.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 0.0596
Minimum 0
Maximum 10.3
Zeros 6325
Zeros (%) 90.8%
Negatives 0
Negatives (%) 0.0%
  • rain is skewed right (γ1 = 11.4044)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 0.3
Maximum 10.3
Range 10.3
IQR 0

Descriptive Statistics

Mean 0.0596
Standard Deviation 0.3295
Variance 0.1086
Sum 415.2
Skewness 11.4044
Kurtosis 209.2432
Coefficient of Variation 5.5282
  • rain is not normally distributed (p-value 4.317126577897763e-25)
  • rain has 641 outliers
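
The outlier count for rain equals the number of non-zero readings (6966 - 6325 = 641). That is what the common 1.5 x IQR rule (Tukey fences) produces when the IQR is 0. The report does not state which rule it applies, so the sketch below is an assumption that happens to be consistent with the numbers; the function name is illustrative.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5):
    """Return (lower, upper) bounds; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles copied from the report
rain_low, rain_high = tukey_fences(0, 0)   # rain: Q1 = Q3 = 0
wdsp_low, wdsp_high = tukey_fences(6, 11)  # wdsp: Q1 = 6, Q3 = 11

print("rain fences:", rain_low, rain_high)
print("wdsp fences:", wdsp_low, wdsp_high)
print("non-zero rain rows:", 6966 - 6325)
```

With both rain fences at 0, every hour with any rainfall is flagged, so the outlier count says more about how rarely it rains than about anomalous readings.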

temp

numerical

Approximate Distinct Count 284
Approximate Unique (%) 4.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 10.7424
Minimum -4
Maximum 26.3
Zeros 7
Zeros (%) 0.1%
Negatives 54
Negatives (%) 0.8%
  • temp is skewed right (γ1 = 0.0982)

Quantile Statistics

Minimum -4
5-th Percentile 2.6
Q1 7.025
Median 10.6
Q3 14.5
95-th Percentile 18.775
Maximum 26.3
Range 30.3
IQR 7.475

Descriptive Statistics

Mean 10.7424
Standard Deviation 5.0022
Variance 25.0216
Sum 74831.5
Skewness 0.09824
Kurtosis -0.4063
Coefficient of Variation 0.4656
  • temp is not normally distributed (p-value 1.1961746774938518e-14)
  • temp has 5 outliers

rhum

numerical

Approximate Distinct Count 69
Approximate Unique (%) 1.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 80.5459
Minimum 24
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • rhum is skewed left (γ1 = -0.711)

Quantile Statistics

Minimum 24
5-th Percentile 58
Q1 73
Median 82
Q3 90
95-th Percentile 97
Maximum 100
Range 76
IQR 17

Descriptive Statistics

Mean 80.5459
Standard Deviation 11.9187
Variance 142.0561
Sum 561083
Skewness -0.711
Kurtosis 0.2116
Coefficient of Variation 0.148
  • rhum has 65 outliers

wdsp

numerical

Approximate Distinct Count 33
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 8.8114
Minimum 1
Maximum 35
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • wdsp is skewed right (γ1 = 1.0021)

Quantile Statistics

Minimum 1
5-th Percentile 3
Q1 6
Median 8
Q3 11
95-th Percentile 17
Maximum 35
Range 34
IQR 5

Descriptive Statistics

Mean 8.8114
Standard Deviation 4.3837
Variance 19.2164
Sum 61380
Skewness 1.0021
Kurtosis 1.6437
Coefficient of Variation 0.4975
  • wdsp is not normally distributed (p-value 0.0003026379175468716)
  • wdsp has 208 outliers

day

numerical

Approximate Distinct Count 31
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 15.6405
Minimum 1
Maximum 31
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • day is skewed right (γ1 = 0.0043)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 8
Median 16
Q3 23
95-th Percentile 29
Maximum 31
Range 30
IQR 15

Descriptive Statistics

Mean 15.6405
Standard Deviation 8.6897
Variance 75.5104
Sum 108952
Skewness 0.004253
Kurtosis -1.1777
Coefficient of Variation 0.5556
  • day is not normally distributed (p-value 2.4754078440326383e-76)

month

numerical

Approximate Distinct Count 12
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 6.5576
Minimum 1
Maximum 12
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • month is skewed left (γ1 = -0.0455)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 4
Median 7
Q3 10
95-th Percentile 12
Maximum 12
Range 11
IQR 6

Descriptive Statistics

Mean 6.5576
Standard Deviation 3.4376
Variance 11.8171
Sum 45680
Skewness -0.04554
Kurtosis -1.197
Coefficient of Variation 0.5242
  • month is not normally distributed (p-value 0.0034584559394370364)

year

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 469.4 KB
  • The largest value (2021) is over 5.09 times larger than the second largest value (2022)

Length

Mean 4
Standard Deviation 0
Median 4
Minimum 4
Maximum 4

Sample

1st row 2021
2nd row 2021
3rd row 2021
4th row 2021
5th row 2021

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 27864
  • The top 2 categories (2021, 2022) take over 50.0%
  • year has words of constant length

holiday

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 476.2 KB
  • The largest value (False) is over 1392.2 times larger than the second largest value (True)

Length

Mean 4.9993
Standard Deviation 0.02678
Median 5
Minimum 4
Maximum 5

Sample

1st row False
2nd row False
3rd row False
4th row False
5th row False

Letter

Count 34825
Lowercase Letter 27859
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (False, True) take over 50.0%
  • The largest value (false) is over 1392.2 times larger than the second largest value (true)
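
The imbalance ratios for the boolean columns can be cross-checked against their Letter counts. The sketch assumes the values are stored as the strings "True" (4 letters) and "False" (5 letters), as the Sample rows suggest. The helper name bool_counts is hypothetical.

```python
def bool_counts(letter_count: int, rows: int):
    """Solve 4*t + 5*f = letter_count with t + f = rows; return (t, f)."""
    n_false = letter_count - 4 * rows
    n_true = rows - n_false
    return n_true, n_false


ROWS = 6966
# Letter "Count" values copied from the report
for name, letters in [("holiday", 34825), ("working_day", 29887), ("peak", 32395)]:
    n_true, n_false = bool_counts(letters, ROWS)
    ratio = max(n_true, n_false) / min(n_true, n_false)
    print(f"{name}: True={n_true}, False={n_false}, ratio={ratio:.2f}")
```

Under this assumption holiday has only 5 True rows out of 6966, and the derived ratios for holiday, working_day and peak agree with the 1392.2, 2.44 and 1.86 stated in the report.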

dayofweek_n

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 449.0 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 6966
  • dayofweek_n has words of constant length

dayofweek

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 490.8 KB

Length

Mean 7.1503
Standard Deviation 1.1257
Median 7
Minimum 6
Maximum 9

Sample

1st row Monday
2nd row Monday
3rd row Monday
4th row Monday
5th row Monday

Letter

Count 49809
Lowercase Letter 42843
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0

working_day

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 471.4 KB
  • The largest value (True) is over 2.44 times larger than the second largest value (False)

Length

Mean 4.2904
Standard Deviation 0.454
Median 4
Minimum 4
Maximum 5

Sample

1st row True
2nd row True
3rd row True
4th row True
5th row True

Letter

Count 29887
Lowercase Letter 22921
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (True, False) take over 50.0%
  • The largest value (true) is over 2.44 times larger than the second largest value (false)

season

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 483.0 KB

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row Winter
2nd row Winter
3rd row Winter
4th row Winter
5th row Winter

Letter

Count 41796
Lowercase Letter 34830
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Summer, Autumn) take over 50.0%
  • season has words of constant length

peak

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 473.8 KB
  • The largest value (False) is over 1.86 times larger than the second largest value (True)

Length

Mean 4.6504
Standard Deviation 0.4769
Median 5
Minimum 4
Maximum 5

Sample

1st row False
2nd row True
3rd row True
4th row True
5th row True

Letter

Count 32395
Lowercase Letter 25429
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (False, True) take over 50.0%
  • The largest value (false) is over 1.86 times larger than the second largest value (true)

timesofday

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 491.1 KB

Length

Mean 7.1918
Standard Deviation 1.4242
Median 7
Minimum 5
Maximum 9

Sample

1st row Night
2nd row Morning
3rd row Morning
4th row Morning
5th row Morning

Letter

Count 50098
Lowercase Letter 43132
Space Separator 0
Uppercase Letter 6966
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Afternoon, Morning) take over 50.0%

rain_type

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 491.3 KB
  • The largest value (no rain) is over 18.82 times larger than the second largest value (drizzle)

Length

Mean 7.2278
Standard Deviation 1.1002
Median 7
Minimum 7
Maximum 13

Sample

1st row no rain
2nd row no rain
3rd row no rain
4th row no rain
5th row no rain

Letter

Count 43719
Lowercase Letter 43719
Space Separator 6630
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (no rain, drizzle) take over 50.0%
  • The largest value (rain) is over 19.73 times larger than the second largest value (drizzle)

count

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 108.8 KB
Mean 4.7544
Minimum 1
Maximum 26
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • count is skewed right (γ1 = 1.1771)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 2
Median 4
Q3 7
95-th Percentile 11
Maximum 26
Range 25
IQR 5

Descriptive Statistics

Mean 4.7544
Standard Deviation 3.4421
Variance 11.8479
Sum 33119
Skewness 1.1771
Kurtosis 1.565
Coefficient of Variation 0.724
  • count is not normally distributed (p-value 1.1206402118023081e-08)
  • count has 93 outliers
